A Fast Generic Sequence Matching Algorithm

Authors

  • David R. Musser
  • Gor V. Nishanov
Abstract

A string matching (and, more generally, sequence matching) algorithm is presented that has a linear worst-case computing time bound, a low worst-case bound on the number of comparisons (2n), and sublinear average-case behavior that is better than that of the fastest versions of the Boyer-Moore algorithm. The algorithm retains its efficiency advantages in a wide variety of sequence matching problems of practical interest, including traditional string matching; large-alphabet problems (as in Unicode strings); and small-alphabet, long-pattern problems (as in DNA searches). Since it is expressed as a generic algorithm for searching in sequences over an arbitrary type T, it is well suited for use in generic software libraries such as the C++ Standard Template Library. The algorithm was obtained by adding to the Knuth-Morris-Pratt algorithm one of the pattern-shifting techniques from the Boyer-Moore algorithm, with provision for use of hashing in this technique. In situations in which a hash function or random access to the sequences is not available, the algorithm falls back to an optimized version of the Knuth-Morris-Pratt algorithm.

Key words: String search; String matching; Pattern matching; Sequence matching; Generic algorithms; Knuth-Morris-Pratt algorithm; Boyer-Moore algorithm; DNA pattern matching; C++ Standard Template Library; STL; Ada; Literate programming
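As an illustration of the fallback case mentioned in the abstract, here is a minimal sketch of textbook Knuth-Morris-Pratt search over an arbitrary element type T. It is not the authors' optimized version and it omits the Boyer-Moore-style skip loop and hashing entirely. The names `kmp_failure_table` and `kmp_search` are hypothetical, and the sketch assumes only that elements support `operator==` and that the text can be read through forward iterators.

```cpp
#include <cstddef>
#include <iterator>
#include <vector>

// Textbook KMP failure table (not the paper's optimized variant).
// fail[i] is the length of the longest proper prefix of pattern[0..i]
// that is also a suffix of pattern[0..i].
template <typename T>
std::vector<std::size_t> kmp_failure_table(const std::vector<T>& pattern) {
    std::vector<std::size_t> fail(pattern.size(), 0);
    std::size_t k = 0;  // length of the currently matched prefix
    for (std::size_t i = 1; i < pattern.size(); ++i) {
        while (k > 0 && !(pattern[i] == pattern[k])) k = fail[k - 1];
        if (pattern[i] == pattern[k]) ++k;
        fail[i] = k;
    }
    return fail;
}

// Hypothetical helper: returns an iterator to the first occurrence of
// pattern in [first, last), or last if there is none. The text is read
// strictly left to right, so forward iterators are sufficient.
template <typename ForwardIt, typename T>
ForwardIt kmp_search(ForwardIt first, ForwardIt last,
                     const std::vector<T>& pattern) {
    typedef typename std::iterator_traits<ForwardIt>::difference_type Diff;
    if (pattern.empty()) return first;
    const std::vector<std::size_t> fail = kmp_failure_table(pattern);
    std::size_t k = 0;        // number of pattern elements matched so far
    ForwardIt start = first;  // where the current candidate match begins
    for (ForwardIt it = first; it != last; ++it) {
        // On a mismatch, slide the candidate forward using the table
        // instead of re-reading any text element.
        while (k > 0 && !(*it == pattern[k])) {
            const std::size_t shorter = fail[k - 1];
            std::advance(start, static_cast<Diff>(k - shorter));
            k = shorter;
        }
        if (*it == pattern[k]) {
            ++k;
        } else {
            ++start;  // k == 0 here: the candidate moves past this element
        }
        if (k == pattern.size()) return start;
    }
    return last;
}
```

For example, calling `kmp_search(text.begin(), text.end(), pattern)` on a `std::string` text with a `std::vector<char>` pattern returns an iterator to the start of the first match. Because the search never moves backward in the text, the same call works on containers without random access, which is the situation the abstract describes for the fallback.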


Similar resources

A New RSTB Invariant Image Template Matching Based on Log-Spectrum and Modified ICA

Template matching is a widely used technique in many image processing and machine vision applications. In this paper we propose a new, fast, and reliable template matching algorithm that is invariant to Rotation, Scale, Translation and Brightness (RSTB) changes. For this purpose, we adopt the idea of the ring projection transform (RPT) of an image. In the proposed algorithm, two novel s...


gpALIGNER: A Fast Algorithm for Global Pairwise Alignment of DNA Sequences

Bioinformatics, through the sequencing of the full genomes for many species, is increasingly relying on efficient global alignment tools exhibiting both high sensitivity and specificity. Many computational algorithms have been applied for solving the sequence alignment problem. Dynamic programming, statistical methods, approximation and heuristic algorithms are the most common methods appli...


Adaptive search area for fast motion estimation

In this paper a new method for determining the search area for a motion estimation algorithm based on block matching is suggested. In the proposed method the search area is adaptively found for each block of a frame. This search area is similar to that of the full search (FS) algorithm but smaller for most blocks of a frame. Therefore, the proposed algorithm is analogous to FS in terms of reg...


Fast Computational Four-Neighborhood Search Algorithm For Block matching Motion Estimation

Motion estimation is an effective method for removing the temporal redundancy found in video sequence compression. The block matching algorithm (BMA) has been widely used in motion estimation, and a number of fast algorithms have been proposed to reduce the computational complexity of BMA. In this paper we propose a new search strategy for fast block matching based on Four-Neighborhood Search (FNS) and fast com...


Measurement of Left Ventricular Myocardium Wall Instantaneous Motions with Echocardiographic Sequence Images

Background & Aims: One of the important aims of quantitative cardiac image processing is the clarification of myocardial motions in order to derive the biomechanical behavior of the heart in disease conditions. In this study we present a computerized analysis method for detecting instantaneous myocardial changes using 2D echocardiography images. Methods: The analysis was performed on th...



Journal:
  • CoRR

Volume: abs/0810.0264

Publication date: 1998